Small sample bias and selection bias effects in multivariate calibration, exemplified for OLS and PLS regressions

Author

  • Rolf Sundberg
Abstract

In multivariate calibration by, for example, ordinary least squares multiple regression (OLSR) or partial least squares regression (PLSR), the predictor ŷ(x) is perfect for the calibration sample itself, in the sense that the regression of observed y on predicted ŷ(x) is y = ŷ(x). Plots of y against ŷ(x) are widely used to illustrate how good the calibration is and how well prediction works. Usually and rightly, such plots are combined with cross-validation. In particular, cross-validation can show that for small samples the predictor ŷ(x) will be biased, in the sense that the regression coefficient of y on ŷ(x) is less than one, typically only slightly so for PLSR but substantially so for OLSR. Another bias effect appears when the y-values for the calibration are more or less selected. An increase in the spread of y might appear desirable because it increases the precision of the calibration. However, the resulting selection bias can affect both PLSR and OLSR substantially, and an additional problem is that this bias cannot be detected by cross-validation. These bias effects are illustrated here by resampling from a large data set containing measurements on 344 pigs from slaughter pig grading.
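The small-sample effect described in the abstract can be sketched numerically. The Python example below is only an illustration under stated assumptions: it uses simulated Gaussian data rather than the pig grading data, covers OLS only (not PLSR), and the sample size, number of predictors, coefficients, and function names are all hypothetical choices made for this sketch. It compares the slope of y on ŷ(x) for in-sample fitted values with the slope for leave-one-out cross-validated predictions.

```python
import numpy as np

def slope_y_on_pred(y, y_hat):
    """Slope of the least-squares line of observed y on predicted y_hat."""
    return np.polyfit(y_hat, y, 1)[0]

def ols_fit_predict(X_train, y_train, X_new):
    """OLS with intercept, fitted on the training rows and applied to X_new."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def in_sample_and_loo_slopes(X, y):
    """Return (slope using fitted values, slope using leave-one-out predictions)."""
    n = len(y)
    fitted = ols_fit_predict(X, y, X)
    loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop observation i from the calibration
        loo[i] = ols_fit_predict(X[keep], y[keep], X[i:i + 1])[0]
    return slope_y_on_pred(y, fitted), slope_y_on_pred(y, loo)

# Assumed simulation settings (not taken from the paper).
rng = np.random.default_rng(0)
n, p = 20, 8                              # small calibration sample, many predictors
beta = np.full(p, 0.5)

slopes = []
for _ in range(200):                      # repeat to average out sampling noise
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)
    slopes.append(in_sample_and_loo_slopes(X, y))
slopes = np.array(slopes)

max_in_sample_dev = np.abs(slopes[:, 0] - 1).max()
median_loo_slope = np.median(slopes[:, 1])
print("largest deviation of in-sample slope from 1:", max_in_sample_dev)
print("median cross-validated slope:", median_loo_slope)
```

The in-sample slope equals one up to rounding because OLS fitted values are orthogonal to the residuals, while the cross-validated slope tends to fall below one in this small-sample setting. The sketch does not reproduce the selection bias effect, which according to the abstract cross-validation cannot detect.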


Similar resources

Small sample and selection bias effects in calibration under latent factor regression models

We study bias of predictors when a multivariate calibration procedure has been applied to relate a scalar y (concentration of an analyte, say) to a vector x (spectral intensities, say). The model for data is assumed to be of latent factor regression type, with multiple regression models and errors-in-variables models as special cases. The calibration procedures explicitly studied are OLSR, PLSR...


Interaction Between Race and Gender and Effect on Implicit Racial Bias Against Blacks

  Background and aims: ...


Estimating the "impact" of out-of-home placement on child well-being: approaching the problem of selection bias.

This study used data on 2,453 children aged 4-17 from the National Survey of Child and Adolescent Well-Being and 5 analytic methods that adjust for selection factors to estimate the impact of out-of-home placement on children's cognitive skills and behavior problems. Methods included ordinary least squares (OLS) regressions and residualized change, simple change, difference-in-difference, and f...


A more accurate method of predicting soft tissue changes after mandibular setback surgery.

PURPOSE To propose a more accurate method to predict the soft tissue changes after orthognathic surgery. PATIENTS AND METHODS The subjects included 69 patients who had undergone surgical correction of Class III mandibular prognathism by mandibular setback. Two multivariate methods of forming prediction equations were examined using 134 predictor and 36 soft tissue response variables: the ordi...


Application of Genetic Algorithms for Pixel Selection in MIA-QSAR Studies on Anti-HIV HEPT Analogues for New Design Derivatives

Quantitative structure-activity relationship (QSAR) analysis has been carried out with a series of 107 anti-HIV HEPT compounds with antiviral activity, which was performed by chemometrics methods. Bi-dimensional images were used to calculate some pixels and multivariate image analysis was applied to QSAR modelling of the anti-HIV potential of HEPT analogues by means of multivariate calibration,...




Publication date: 2005